Jigsawing: A Method to Create Virtual Examples in OCR Data
Abstract
In this theoretical note, we propose using a suffix tree on square matrices to represent a set of training patterns compactly. We show how a test pattern can be generated by jigsawing regions from different training patterns. This leads naturally to a compact, data-dependent representation of a test pattern, which we call the description tree. We envisage using the description tree in a variety of applications, including nearest neighbor classifiers, data-dependent distance norms, kernel methods, and syntactic pattern recognition. We provide arguments based on statistical learning theory to show that our method generates valid virtual examples and hence will lead to better classification accuracy.
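To make the jigsawing idea concrete, here is a minimal Python sketch under strong simplifying assumptions. It is not the suffix-tree construction proposed in the note. The names `jigsaw` and `describe`, the fixed tile size `block`, and the restriction to tiles copied from the same position in each training pattern are all hypothetical choices made for illustration. `jigsaw` assembles a virtual example from tiles of randomly chosen training patterns. `describe` records which training pattern can supply each tile of a test pattern: a flat stand-in for the description tree, not the tree itself.

```python
import random


def jigsaw(patterns, block, rng=None):
    """Build one virtual example from equally sized 2D patterns.

    Each block x block tile is copied from a randomly chosen training
    pattern at the same tile position (an illustrative assumption).
    """
    if rng is None:
        rng = random.Random()
    rows, cols = len(patterns[0]), len(patterns[0][0])
    virtual = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            source = rng.choice(patterns)
            for r in range(r0, min(r0 + block, rows)):
                for c in range(c0, min(c0 + block, cols)):
                    virtual[r][c] = source[r][c]
    return virtual


def describe(test, patterns, block):
    """Map each tile origin of `test` to the index of the first training
    pattern whose tile at that position matches exactly, or None."""
    rows, cols = len(test), len(test[0])
    description = {}
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            r1, c1 = min(r0 + block, rows), min(c0 + block, cols)
            tile = [row[c0:c1] for row in test[r0:r1]]
            description[(r0, c0)] = next(
                (i for i, p in enumerate(patterns)
                 if [row[c0:c1] for row in p[r0:r1]] == tile),
                None,
            )
    return description
```

By construction, every tile of a pattern produced by `jigsaw` can be traced back to some training pattern, so `describe` never returns `None` for it. A test pattern with a tile found in no training pattern gets `None` for that tile.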